Accelerate your development with our API within a minute.
Qubrid AI simplifies the process of integrating high-performance open-source models, allowing you to run inference with just a few lines of code.

1. Register for an account

Begin by creating an account to obtain your unique API key. Once your account is active, configure your environment by exporting your key as a variable named QUBRID_API_KEY:
Shell
export QUBRID_API_KEY=xxxxx
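
To confirm from Python that the exported key is visible, a minimal sketch along these lines may help. The helper name `build_headers` is hypothetical and not part of any Qubrid SDK; the header layout simply mirrors the request example in step 2.

```python
import os
from typing import Mapping


def build_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build request headers from QUBRID_API_KEY, failing early if it is unset.

    Hypothetical helper for illustration; not part of an official client.
    """
    api_key = env.get("QUBRID_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("QUBRID_API_KEY is not set; export it before running.")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Demo with a placeholder mapping instead of the real environment.
print(build_headers({"QUBRID_API_KEY": "xxxxx"})["Authorization"])
```

Calling `build_headers()` with no argument reads the real environment, so a missing key surfaces as a clear error before any request is sent.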

2. Run your first Model Inference

Select the model you want to run. This demonstration uses OpenAI GPT-OSS 120B with streaming enabled to show real-time token generation.
import json
import os
from pprint import pprint

import requests

url = "https://platform.qubrid.com/api/v1/qubridai/chat/completions"
headers = {
    # Read the key exported in step 1 instead of hard-coding it.
    "Authorization": f"Bearer {os.environ['QUBRID_API_KEY']}",
    "Content-Type": "application/json",
}

data = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {
            "role": "user",
            "content": "Explain quantum computing to a 5 year old.",
        }
    ],
    "temperature": 0.7,
    "max_tokens": 4096,
    "stream": True,  # stream tokens as they are generated
    "top_p": 0.8,
}

response = requests.post(
    url,
    headers=headers,
    json=data,
    stream=True,  # let requests yield lines as they arrive instead of buffering
)
content_type = response.headers.get("Content-Type", "")

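if "application/json" in content_type:
    # Non-streaming response: the whole body is a single JSON document.
    pprint(response.json())
else:
    # Streaming response: each chunk arrives on a line prefixed with "data:".
    for line in response.iter_lines(decode_unicode=True):
        if not line:
            continue

        if line.startswith("data:"):
            # Strip only the leading prefix; the payload itself may contain "data:".
            payload = line[len("data:"):].strip()

            if payload == "[DONE]":
                break

            try:
                chunk = json.loads(payload)
                pprint(chunk)
            except json.JSONDecodeError:
                print("Raw chunk:", payload)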
Congratulations! You have successfully sent your first inference request to the Qubrid AI cloud.
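
The streaming loop above prints each raw chunk. To assemble the generated text instead, something like the following sketch could work. It assumes the chunks follow the OpenAI-style layout (`choices[0].delta.content`), which this page does not confirm, so check a real chunk before relying on it. The helper names `parse_data_line` and `iter_text` are hypothetical, and the sample lines are hand-written, not real API output.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_data_line(line: str) -> Optional[str]:
    """Return the payload of a "data:" line, or None for any other line."""
    if not line or not line.startswith("data:"):
        return None
    # Slice off only the leading prefix so "data:" inside the payload survives.
    return line[len("data:"):].strip()


def iter_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from streamed lines until the [DONE] sentinel."""
    for line in lines:
        payload = parse_data_line(line)
        if payload is None:
            continue
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip chunks that are not valid JSON
        if not isinstance(chunk, dict):
            continue
        # ASSUMPTION: OpenAI-style chunk layout; adjust if the schema differs.
        for choice in chunk.get("choices") or []:
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Offline demo with hand-written sample lines (not real API output).
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_text(sample_lines)))
```

With a live response, passing `response.iter_lines(decode_unicode=True)` in place of `sample_lines` would feed the same generator, and printing each fragment with `end=""` and `flush=True` shows the text as it arrives.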

Next steps